AMENDMENT TO THE CLAIMS



WHAT IS CLAIMED IS: 

1. (Currently amended) A method for estimating naturalness of
synthesized speech, wherein naturalness is a subjective quality of
synthesized speech, the method comprising:

generating a set of synthesized utterances from textual
information;

subjectively rating each of the synthesized utterances;

calculating a score for each of the synthesized utterances
using an objective measure, the objective measure being
a function of textual information used to form the
utterances;

ascertaining a relationship between the scores of the
objective measure and subjective ratings of the
synthesized utterances; and

using the relationship to estimate naturalness of synthesized
speech.

2. (Currently amended) The method of claim 30 wherein the
objective measure comprises an indication of a position of a
speech unit in a phrase.

3. (Currently amended) The method of claim 30 wherein the
objective measure comprises an indication of a position of a
speech unit in a word.

4. (Currently amended) The method of claim 30 wherein the
objective measure comprises an indication of a category for a
phoneme preceding a speech unit.

5. (Currently amended) The method of claim 30 wherein the
objective measure comprises an indication of a category for a
phoneme following a speech unit.

6. (Currently amended) The method of claim 30 wherein the
objective measure comprises an indication of a category for the
tone of a preceding speech unit.

7. (Currently amended) The method of claim 30 wherein the
objective measure comprises an indication of a category for the
tone of a following speech unit.

8. (Currently amended) The method of claim 30 wherein the
objective measure comprises an indication of a prosodic mismatch
between successive speech units.

9. (Currently amended) The method of claim 30 wherein the
objective measure comprises an indication of a level of stress of
a speech unit.

10. (Currently amended) The method of claim 30 wherein the
objective measure score for each synthesized utterance is a
function of a length of said each synthesized utterance.

11. (Original) The method of claim 10 wherein the length 
comprises a number of speech units in an utterance. 

12. (Currently amended) The method of claim 30 wherein
calculating a score includes generating context vectors for each
synthesized utterance wherein the context vectors comprise at
least two coordinates of textual information from a set
including:

an indication of a position of a speech unit in a phrase;

an indication of a position of a speech unit in a word;

an indication of a category for a phoneme preceding a speech
unit;

an indication of a category for a phoneme following a speech
unit;

an indication of a category for the tone of a preceding
speech unit;

an indication of a category for the tone of a following
speech unit;

an indication of a level of stress of a speech unit; and

an indication of a degree of coupling with a neighboring
speech unit.

13. (Currently amended) The method of claim 12 wherein
calculating a score includes generating context vectors for each
of the synthesized utterances wherein the context vectors
comprise at least three coordinates of textual information from
the set.

14. (Currently amended) The method of claim 12 wherein
calculating a score includes generating context vectors for each
of the synthesized utterances wherein the context vectors
comprise at least four coordinates of textual information from
the set.

15. (Cancelled) 



16. (Currently amended) The method of claim 12 wherein
calculating a score includes generating context vectors for each
of the synthesized utterances wherein the context vectors
comprise at least six coordinates of textual information from
the set.

17. (Original) The method of claim 12 wherein the objective
measure includes an indication of prosodic mismatch of successive
speech units.

18. (Original) The method of claim 12 wherein the coordinates are 
weighted. 

19. (Currently amended) A method for developing a speech
synthesizer, the method comprising:

obtaining a set of synthesized utterances based on textual
information from the speech synthesizer;

subjectively rating naturalness of each of the synthesized
utterances;

calculating a score for each of the synthesized utterances
using an objective measure, the objective measure being
a function of textual information of speech units for
each of the utterances;

ascertaining a relationship between the scores of the
objective measure and ratings of the synthesized
utterances;

varying a parameter of the speech synthesizer;

obtaining speech units for another utterance after the
parameter of the speech synthesizer has been varied;

calculating a second score for said another utterance using
the objective measure; and

using the relationship and the second score to estimate
naturalness of said another utterance.

20. (Currently amended) The method of claim 31 wherein
obtaining speech units for another utterance includes obtaining
speech units for a second set of utterances, wherein calculating a
second score includes calculating corresponding scores for each of
the utterances of the second set of utterances, and wherein using
the relationship includes using the relationship to estimate
naturalness of each of said second set of utterances.

21. (Currently amended) The method of claim 31 wherein the
parameter comprises an amount of speech units available for
synthesis.

22. (Currently amended) The method of claim 31 wherein the
parameter comprises an algorithm for selecting speech units.

23. (Currently amended) The method of claim 31 wherein
calculating a score includes generating context vectors for each
synthesized utterance wherein the context vectors comprise at
least two coordinates of textual information from a set
including:

an indication of a position of a speech unit in a phrase;

an indication of a position of a speech unit in a word;

an indication of a category for a phoneme preceding a speech
unit;

an indication of a category for a phoneme following a speech
unit;

an indication of a category for the tone of a preceding
speech unit;

an indication of a category for a tone of a following speech
unit; and

an indication of a level of stress of a speech unit.

24. (Currently amended) The method of claim 23 wherein
calculating a score includes generating context vectors for each
of the synthesized utterances wherein the context vectors
comprise at least three coordinates of textual information from
the set.

25. (Currently amended) The method of claim 23 wherein
calculating a score includes generating context vectors for each
of the synthesized utterances wherein the context vectors
comprise at least four coordinates of textual information from
the set.

26. (Currently amended) The method of claim 23 wherein
calculating a score includes generating context vectors for each
of the synthesized utterances wherein the context vectors
comprise at least five coordinates of textual information from
the set.

27. (Cancelled) 

28. (Original) The method of claim 23 wherein the objective
measure includes an indication of prosodic mismatch of successive
speech units.

29. (Original) The method of claim 23 wherein the coordinates are 
weighted. 

30. (New) The method of claim 1 wherein the objective measure is 
a function of a concatenative cost of the textual information used 
to form words in the utterances. 

31. (New) The method of claim 19 wherein the objective measure is
a function of a concatenative cost of the textual information used
to form a word in each utterance.
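
Illustrative note (not part of the claims). The steps recited in claims 1 and 19 (score each utterance with an objective measure, ascertain a relationship between the scores and the subjective ratings, then use the relationship to estimate naturalness) can be sketched as below. The claims do not specify the form of the relationship; a least-squares straight line is assumed here purely for illustration. The function names `utterance_score`, `fit_linear_relationship`, and `estimate_naturalness`, the per-unit cost values, and the averaging over the number of speech units (one possible reading of claims 10 and 11) are all hypothetical.

```python
from statistics import mean


def utterance_score(unit_costs):
    """Hypothetical objective score: average cost per speech unit.

    Assumption: the per-unit costs are already computed from textual
    information; dividing by the unit count makes the score a function
    of utterance length.
    """
    if not unit_costs:
        raise ValueError("an utterance needs at least one speech unit")
    return sum(unit_costs) / len(unit_costs)


def fit_linear_relationship(scores, ratings):
    """Fit rating ~ slope * score + intercept by ordinary least squares.

    Assumption: a linear form is chosen only for illustration.
    """
    if len(scores) != len(ratings) or len(scores) < 2:
        raise ValueError("need at least two (score, rating) pairs")
    mean_score, mean_rating = mean(scores), mean(ratings)
    sxx = sum((x - mean_score) ** 2 for x in scores)
    if sxx == 0:
        raise ValueError("scores must not all be identical")
    sxy = sum((x - mean_score) * (y - mean_rating)
              for x, y in zip(scores, ratings))
    slope = sxy / sxx
    intercept = mean_rating - slope * mean_score
    return slope, intercept


def estimate_naturalness(score, relationship):
    """Apply the fitted relationship to the score of a new utterance."""
    slope, intercept = relationship
    return slope * score + intercept


# Made-up example data: objective scores and subjective ratings
# for four synthesized utterances.
scores = [0.0, 1.0, 2.0, 3.0]
ratings = [5.0, 4.0, 3.0, 2.0]
relationship = fit_linear_relationship(scores, ratings)

# After varying a synthesizer parameter, score another utterance and
# estimate its naturalness without a new listening test.
second_score = utterance_score([1.0, 2.0])
estimate = estimate_naturalness(second_score, relationship)
```

In this sketch a higher objective score corresponds to a lower rating, which is what one would expect if the score is a cost; that direction comes from the made-up data and is not stated in the claims.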



